Abstract—Dempster-Shafer theory (DST) is an important theory for
information fusion. However, how to determine the basic belief
assignment (BBA) in DST is still an open issue. The interval number
based BBA determination method is simple and effective: the features
of the samples of different classes are modeled using interval
numbers, i.e., an interval number model is constructed for each focal
element. Then, the distances between interval numbers are used to
measure the similarity degrees between the testing sample and each
focal element, and the similarity degrees are used to determine the
BBA. The definition of the interval numbers' distance is crucial for
the effectiveness of interval number based BBA determination methods.
In this paper, we use different interval numbers' distances for
determining BBAs. Using an artificial data set and the Iris data set
from the open UCI database, we compare and analyze the BBAs determined
with the different distances.

Index Terms—Dempster-Shafer theory, basic belief assignment, 
distance of interval numbers, information fusion, classification. 


I. INTRODUCTION 


Dempster-Shafer theory (DST) [1] was proposed by Dempster in the
1960s and was later developed by Shafer [2]. In DST, basic beliefs
are assigned to the power set of the frame of discernment (FOD),
which is used to describe the uncertainty of sources of evidence.
Pieces of evidence (i.e., basic belief assignments, BBAs) originating
from different sources can be fused using Dempster's combination
rule [1]. DST has been widely used in information fusion [3]-[5].

When using DST, the first step is to determine the BBAs, which is
still an open issue. Methods for determining BBAs can mainly be
categorized into two branches [6]: (1) experts give the BBAs directly
according to their personal experience; (2) the BBAs are determined
from samples using specific determination rules. In the first branch,
the determination of BBAs relies on the experts' subjective points of
view. In this paper, we focus on the second branch, i.e., the BBAs
are determined based on available samples. Researchers have proposed
many approaches in this branch. Selzer et al. [3] determined the BBAs
based on the number of classes and the environmental weighting
coefficient. Shafer [2] proposed a BBA determination method based on
statistical evidence. Bi et al. [7] designed a triple focal element
BBA for the text classification problem. Salzenstein et al. [8] used
a Gaussian model to obtain the BBAs through iterative estimation.
Deng et al. [9] defined a similarity measure based on the radius of
gravity, which is then used for determining the BBAs. Boudraa et al.
[10] and Florea et al. [11] determined the BBAs based on membership
functions. Han et al. [12] proposed a method for transforming a fuzzy
membership function into BBAs by solving a constrained maximization
or minimization problem. Recently, Kang et al. [6] designed a BBA
determination method using interval numbers.

Kang's interval number based BBA determination method is simple and
effective. It first constructs the interval number [14] models for
each focal element (including the singleton focal elements with a
single class and the compound focal elements with multiple classes)
based on the set of training samples. In Kang's method, Tran and
Duckstein's [14], [16] interval number distance (TD-IND) is used for
measuring the similarity degree between the testing sample and the
interval number models of the different focal elements. Finally, the
similarities are normalized to obtain the values of the BBA. The
definition of the interval numbers' distance (IND) is crucial for the
performance of the interval number based BBA determination method.
There exist many possible choices of INDs, e.g., Gowda and Ravi's
distance [15] (GR-IND), Tran and Duckstein's distance [16] (TD-IND),
the Hausdorff distance [17] (H-IND) and De Carvalho's norm-q distance
[18] (Nq-IND). In this paper, we implement Kang's interval number
based method using different INDs. We analyze the differences between
the BBAs determined using different INDs based on numerical examples.
Furthermore, we use Monte-Carlo experiments to compare the
performance of the interval number based methods with different INDs
by classifying an artificial data set and the Iris data set.


II. BASICS OF DEMPSTER-SHAFER THEORY


Dempster-Shafer theory (DST) (also known as Evidence Theory) is an
appealing mathematical framework which can effectively describe
uncertain information about the state of nature. In DST, the frame of
discernment (FOD) is denoted by
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$. The elements in
$\Theta$ are mutually exclusive and exhaustive. The basic belief
assignment (BBA) function assigns basic beliefs on the power set of
$\Theta$, i.e., $2^{\Theta}$. The BBA is also called the mass
function, which satisfies:

$$ \sum_{A \subseteq \Theta} m(A) = 1, \quad m(\emptyset) = 0 \tag{1} $$


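If $m(A) > 0$ for $A \subseteq \Theta$, $A$ is called a focal element.
The Belief (Bel) and Plausibility (Pl) of $A$ are defined as:

$$ Bel(A) = \sum_{B \subseteq A} m(B) \tag{2} $$

$$ Pl(A) = \sum_{B \cap A \neq \emptyset} m(B) = 1 - Bel(\bar{A}) \tag{3} $$

where $\bar{A}$ denotes the complement of $A$ in $\Theta$.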


The interval $[Bel(A), Pl(A)]$ is called the belief interval, which
represents the uncertainty of the support degree of $A$.
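The definitions in Eqs. (2) and (3) can be illustrated with a minimal Python sketch. The BBA below is hypothetical (the class labels `t1`, `t2`, `t3` and the masses are chosen only for illustration), and focal elements are represented as `frozenset` objects, which is an implementation choice rather than part of the theory.

```python
from typing import Dict, FrozenSet

BBA = Dict[FrozenSet[str], float]

def belief(m: BBA, a: FrozenSet[str]) -> float:
    """Bel(A): sum of the masses of all focal elements contained in A."""
    return sum(v for b, v in m.items() if b <= a)

def plausibility(m: BBA, a: FrozenSet[str]) -> float:
    """Pl(A): sum of the masses of all focal elements that intersect A."""
    return sum(v for b, v in m.items() if b & a)

# Hypothetical BBA on the frame {t1, t2, t3}; the masses sum to 1.
theta = frozenset({"t1", "t2", "t3"})
m = {
    frozenset({"t1"}): 0.5,
    frozenset({"t1", "t2"}): 0.3,
    theta: 0.2,
}

A = frozenset({"t1", "t2"})
print("Bel(A) =", belief(m, A))
print("Pl(A)  =", plausibility(m, A))
# Pl(A) can also be obtained as 1 - Bel(complement of A).
print("1 - Bel(not A) =", 1.0 - belief(m, theta - A))
```

For this BBA the belief interval of `A` is [0.8, 1.0]: the mass 0.2 on the whole frame supports `A` only potentially, so it counts towards Pl but not towards Bel.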

Different information sources can provide different pieces of
evidence, i.e., BBAs. In DST, two BBAs associated with two distinct
sources of evidence can be combined according to Dempster's rule, as
in Eq. (4).


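$$ m(A) = (m_1 \oplus m_2)(A) =
\begin{cases}
\dfrac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - K}, & A \neq \emptyset \\
0, & A = \emptyset
\end{cases} \tag{4} $$

where $K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)$ denotes the
conflicting coefficient. Dempster's combination rule is both
commutative and associative.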
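A minimal sketch of Dempster's rule for two BBAs is given below. It assumes the usual form of the rule (products of masses accumulated on set intersections, with the conflicting mass K normalized away); the two input BBAs and the labels `t1`, `t2` are hypothetical.

```python
from itertools import product
from typing import Dict, FrozenSet

BBA = Dict[FrozenSet[str], float]

def dempster_combine(m1: BBA, m2: BBA) -> BBA:
    """Combine two BBAs; mass on empty intersections is the conflict K."""
    combined: BBA = {}
    conflict = 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc
    if 1.0 - conflict <= 1e-12:
        raise ValueError("total conflict: the rule is undefined for K = 1")
    # Normalize by 1 - K so that the fused masses sum to 1.
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

# Hypothetical BBAs from two sources.
m1 = {frozenset({"t1"}): 0.6, frozenset({"t1", "t2"}): 0.4}
m2 = {frozenset({"t2"}): 0.5, frozenset({"t1", "t2"}): 0.5}

fused = dempster_combine(m1, m2)
for focal, mass in fused.items():
    print(sorted(focal), round(mass, 4))
```

In this example the only conflicting pair is `{t1}` against `{t2}`, so K = 0.6 × 0.5 = 0.3 and every remaining product is divided by 0.7. Because the rule is associative, more than two BBAs can be fused by applying the function repeatedly.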

To make a probabilistic decision, the fused BBA can be transformed
into a probability using the Pignistic probability transformation:

$$ BetP(\theta_i) = \sum_{\theta_i \in A,\ A \subseteq \Theta} \frac{m(A)}{|A|}, \quad \forall \theta_i \in \Theta \tag{5} $$

where $|A|$ denotes the cardinality of $A$.
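The Pignistic transformation splits the mass of each focal element equally among its elements. The following sketch illustrates this on a hypothetical BBA (the labels and masses are illustrative only).

```python
from typing import Dict, FrozenSet

def pignistic(m: Dict[FrozenSet[str], float]) -> Dict[str, float]:
    """BetP: share each focal element's mass equally among its elements."""
    betp: Dict[str, float] = {}
    for focal, mass in m.items():
        share = mass / len(focal)
        for element in focal:
            betp[element] = betp.get(element, 0.0) + share
    return betp

# Hypothetical BBA on the frame {t1, t2, t3}.
m = {
    frozenset({"t1"}): 0.5,
    frozenset({"t1", "t2"}): 0.3,
    frozenset({"t1", "t2", "t3"}): 0.2,
}

betp = pignistic(m)
decision = max(betp, key=betp.get)  # class with the largest BetP
print({k: round(v, 4) for k, v in sorted(betp.items())}, decision)
```

Here `t1` receives 0.5 + 0.3/2 + 0.2/3, so it is the selected class when the decision is based on the maximum of BetP.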


III. KANG’S BBA DETERMINATION METHOD BASED ON 
THE INTERVAL NUMBERS’ DISTANCES 


When using DST, the determination of the BBAs is the first step, and
it is still a challenging task. Interval numbers, which can describe
uncertain or insufficient information, are useful for determining the
BBAs. The definition of interval numbers is as follows: an interval
number $\tilde{a}$ in $\mathbb{R}$ is a set of real numbers that lie
between two real numbers, i.e.,
$\tilde{a} = [a^-, a^+] = \{x \mid a^- \le x \le a^+\}$,
$a^-, a^+ \in \mathbb{R}$ and $a^- \le a^+$. Kang et al. [6] proposed
a BBA determination method based on interval number models, where the
basic beliefs assigned to the different focal elements are determined
based on the interval numbers' distances between the testing sample
and the interval number models of the focal elements. Here, we first
recall Kang's interval number based BBA determination method.

Kang's method determines BBAs on each single feature separately. For
a single feature, Kang's method models the different focal elements
(including the focal elements with a single class and the focal
elements with multiple classes) using interval numbers, and the
testing sample is treated as a degenerate interval (a precise number)
with zero length. Kang's method measures the distances between the
testing sample and the interval number models of the focal elements.
The smaller the distance, the higher the similarity degree between
the testing sample and the focal element, and the higher the basic
belief assigned to that focal element. The steps of Kang's method are
described as follows:


 1) The interval number models of the focal elements with a
    single class are constructed by finding the minimum
    and the maximum of the corresponding class's training
    samples. Then, the interval number models of the focal
    elements with multiple classes are obtained by finding
    the overlapping region of the corresponding single
    classes' interval number models. The interval number
    models of the different focal elements are denoted by
    $\tilde{b}_f$, $f \in 2^{\Theta}$.

 2) Calculate the distances between the testing sample
    (denoted by $\tilde{a}$) and the different focal elements'
    interval number models, i.e., $D(\tilde{a}, \tilde{b}_f)$,
    $\forall f \in 2^{\Theta}$. Note that the length of
    $\tilde{a}$ is 0, i.e., $a^- = a^+$.

 3) Calculate the similarity degrees based on the distances
    according to Eq. (6):

    $$ S(\tilde{a}, \tilde{b}_f) = \frac{1}{1 + \alpha D(\tilde{a}, \tilde{b}_f)} \tag{6} $$

    where $\alpha > 0$ is the support coefficient. Empirically,
    it is proper to set $\alpha = 5$ [6].

 4) The BBA is determined by normalizing the similarity
    degrees of all the focal elements.
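The four steps above can be sketched for one feature as follows. This is an illustrative sketch rather than the authors' implementation: the training values are hypothetical, the similarity is taken as 1/(1 + αD) following Eq. (6), and the bound-wise Hausdorff form is used as a placeholder IND (any other IND could be passed in instead). The printed masses are not claimed to reproduce any table in this paper.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Tuple

Interval = Tuple[float, float]
Models = Dict[FrozenSet[str], Interval]

def hausdorff(a: Interval, b: Interval) -> float:
    """Placeholder IND: largest absolute difference of the bounds."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def build_models(train: Dict[str, List[float]]) -> Models:
    """Step 1: [min, max] per class, overlap region per compound element."""
    singles = {c: (min(v), max(v)) for c, v in train.items()}
    classes = sorted(singles)
    models: Models = {}
    for r in range(1, len(classes) + 1):
        for subset in combinations(classes, r):
            lo = max(singles[c][0] for c in subset)
            hi = min(singles[c][1] for c in subset)
            if lo <= hi:  # focal elements without overlap get no model
                models[frozenset(subset)] = (lo, hi)
    return models

def kang_bba(x: float, models: Models,
             distance: Callable[[Interval, Interval], float] = hausdorff,
             alpha: float = 5.0) -> Dict[FrozenSet[str], float]:
    """Steps 2-4: distances, similarity degrees, normalization."""
    a = (x, x)  # the testing sample as a degenerate interval
    sims = {f: 1.0 / (1.0 + alpha * distance(a, b)) for f, b in models.items()}
    total = sum(sims.values())
    return {f: s / total for f, s in sims.items()}

# Hypothetical training values of one feature for three classes.
train = {"t1": [1.0, 2.5, 4.0], "t2": [3.0, 5.0, 7.0], "t3": [5.0, 6.0, 8.0]}
models = build_models(train)
bba = kang_bba(2.0, models)
for focal, mass in bba.items():
    print(sorted(focal), round(mass, 4))
```

With these training values the single-class models are [1, 4], [3, 7] and [5, 8], the compound models are [3, 4] and [5, 7], and the two focal elements containing both `t1` and `t3` are dropped because their ranges do not overlap.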


Kang's method defines the similarity degrees using an interval
numbers' distance, and the BBAs are obtained by normalizing the
similarity degrees. Thus, the definition of the IND (i.e.,
$D(\tilde{a}, \tilde{b}_f)$) is crucial for this method. The
differences between the BBAs determined by Kang's method using
different INDs are compared in the next section.


IV. COMPARISONS OF INTERVAL NUMBER BASED BBA 
DETERMINATION METHOD USING DIFFERENT INDS 


As aforementioned, the definition of the IND is crucial for 
the interval number based BBA determination methods. Many 
INDs have been proposed. Here, we introduce four widely 
used INDs. 


A. Introduction of the interval numbers' distances


Suppose $\tilde{a} = [a^-, a^+]$ and $\tilde{b} = [b^-, b^+]$ are two
interval numbers. Then [13], [14],
$\tilde{c} = \tilde{a} \oplus \tilde{b} = [c^-, c^+]$, where
$c^- = \min(a^-, b^-)$ and $c^+ = \max(a^+, b^+)$. The length (or
width) of the interval number $\tilde{a}$ is
$\mu(\tilde{a}) = a^+ - a^-$. $D_a$ is the length of the domain [14]
of the interval numbers. To measure the difference between two
interval numbers, many interval numbers' distances (INDs) have been
proposed. The four INDs considered here are defined as follows:
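Gowda and Ravi (1995) [15]: In 1995, Gowda and Ravi proposed a metric
(denoted by GR-IND) combining a position component and a size
component, as follows:

$$ D_{GR}(\tilde{a}, \tilde{b}) = D_p(\tilde{a}, \tilde{b}) + D_s(\tilde{a}, \tilde{b}) \tag{7} $$

where the position component is defined as

$$ D_p(\tilde{a}, \tilde{b}) = \cos\left[\left(1 - \frac{|a^- - b^-|}{D_a}\right) \times \frac{\pi}{2}\right] \tag{8} $$

and the size component is defined as

$$ D_s(\tilde{a}, \tilde{b}) = \cos\left[\frac{\mu(\tilde{a}) + \mu(\tilde{b})}{2 \times \mu(\tilde{a} \oplus \tilde{b})} \times \frac{\pi}{2}\right] \tag{9} $$

Here $\mu(\cdot)$ is the length of an interval,
$\tilde{a} \oplus \tilde{b} = [\min(a^-, b^-), \max(a^+, b^+)]$, and
$D_a$ is the length of the domain.

Tran and Duckstein (2002) [16]: In the framework of fuzzy data
analysis, Tran and Duckstein proposed the interval numbers' distance
(TD-IND):

$$ D_{TD}^2(\tilde{a}, \tilde{b}) = \int_{-1/2}^{1/2}\!\int_{-1/2}^{1/2} \left\{\left[\frac{a^- + a^+}{2} + x(a^+ - a^-)\right] - \left[\frac{b^- + b^+}{2} + y(b^+ - b^-)\right]\right\}^2 dx\, dy
= \left[\frac{a^- + a^+}{2} - \frac{b^- + b^+}{2}\right]^2 + \frac{1}{3}\left[\left(\frac{a^+ - a^-}{2}\right)^2 + \left(\frac{b^+ - b^-}{2}\right)^2\right] \tag{10} $$

so that $D_{TD}$ is the square root of this quantity.
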
Hausdorff distance [17]: Consider two sets $A$ and $B$ of points of
$\mathbb{R}^n$ and a distance $d(x, y)$, where $x \in A$ and
$y \in B$. The Hausdorff distance (H-IND) is defined as follows:

$$ D_H(A, B) = \max\left( \sup_{x \in A} \inf_{y \in B} d(x, y),\ \sup_{y \in B} \inf_{x \in A} d(x, y) \right) \tag{11} $$

If $d(x, y)$ is the Manhattan distance (also called the city block
distance), i.e., $d(x, y) = |x - y|$, then Chavent et al. (2002)
proved that

$$ D_H(\tilde{a}, \tilde{b}) = \max\left( |a^- - b^-|,\ |a^+ - b^+| \right) \tag{12} $$

De Carvalho et al. (2006) [18]: A family of distances between
interval numbers has been proposed by De Carvalho et al. based on the
bounds of the interval numbers. The norm-q metric (Nq-IND) is defined
as:

$$ D_{N_q}(\tilde{a}, \tilde{b}) = \left( |a^- - b^-|^q + |a^+ - b^+|^q \right)^{1/q} \tag{13} $$
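The four INDs can be sketched in Python as below, in the forms in which they are commonly stated: the Gowda-Ravi position and size components, the closed form of the Tran-Duckstein integral, the bound-wise Hausdorff form, and the norm-q form. Intervals are plain `(lower, upper)` tuples. The value `DOMAIN_LENGTH = 7` is an assumption: it is the span [1, 8] covered by the three class ranges of the numerical example in Section IV-B, and under it the sketch agrees with the entries of Table II to four decimals.

```python
import math
from typing import Tuple

Interval = Tuple[float, float]

def width(a: Interval) -> float:
    return a[1] - a[0]

def join(a: Interval, b: Interval) -> Interval:
    """Smallest interval containing both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def gr_ind(a: Interval, b: Interval, domain_length: float) -> float:
    """Gowda-Ravi: position component plus size component.
    Undefined when a and b are the same degenerate interval (zero join width)."""
    d_pos = math.cos((1.0 - abs(a[0] - b[0]) / domain_length) * math.pi / 2.0)
    d_size = math.cos((width(a) + width(b)) / (2.0 * width(join(a, b))) * math.pi / 2.0)
    return d_pos + d_size

def td_ind(a: Interval, b: Interval) -> float:
    """Tran-Duckstein: square root of the closed-form integral."""
    mid_diff = (a[0] + a[1]) / 2.0 - (b[0] + b[1]) / 2.0
    half_a, half_b = width(a) / 2.0, width(b) / 2.0
    return math.sqrt(mid_diff ** 2 + (half_a ** 2 + half_b ** 2) / 3.0)

def h_ind(a: Interval, b: Interval) -> float:
    """Hausdorff distance between two intervals."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def nq_ind(a: Interval, b: Interval, q: float = 2.0) -> float:
    """De Carvalho's norm-q distance on the interval bounds."""
    return (abs(a[0] - b[0]) ** q + abs(a[1] - b[1]) ** q) ** (1.0 / q)

DOMAIN_LENGTH = 7.0   # assumption: span [1, 8] of the example in Section IV-B
sample = (2.0, 2.0)   # testing sample as a degenerate interval
model = (1.0, 4.0)    # interval number model of the first class

row = [gr_ind(sample, model, DOMAIN_LENGTH), td_ind(sample, model),
       h_ind(sample, model), nq_ind(sample, model)]
print([round(d, 4) for d in row])
```

Under these assumptions the four values round to 0.9296, 1.0000, 2.0000 and 2.2361, which is the first row of Table II.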


B. Numerical example 


Different INDs can be used for implementing the BBA determination.
Here, we use a numerical example to compare the interval number based
BBA determination methods using different INDs. The methods are
applied to a three-class classification problem. In this numerical
example, we give the feature ranges of the different classes
directly, as shown in Figure 1, where the feature range of class 1
($\theta_1$) is [1, 4], that of class 2 ($\theta_2$) is [3, 7] and
that of class 3 ($\theta_3$) is [5, 8].

From Figure 1, the interval number models of the focal elements can
be constructed, as listed in Table I. Note that in this example
$\{\theta_1, \theta_3\}$ and $\{\theta_1, \theta_2, \theta_3\}$ do not
have interval number models, because the interval number models of
$\{\theta_1\}$ and $\{\theta_3\}$ do not overlap.

Fig. 1. Feature values' ranges of different classes

TABLE I
THE INTERVAL NUMBER MODELS OF FOCAL ELEMENTS.

Focal elements                     | Interval number model
$\{\theta_1\}$                     | [1, 4]
$\{\theta_2\}$                     | [3, 7]
$\{\theta_3\}$                     | [5, 8]
$\{\theta_1, \theta_2\}$           | [3, 4]
$\{\theta_2, \theta_3\}$           | [5, 7]
$\{\theta_1, \theta_3\}$           | N/A
$\{\theta_1, \theta_2, \theta_3\}$ | N/A

Suppose we have a testing sample whose feature value is 2, i.e.,
$\tilde{a} = [2, 2]$, shown as the purple dot on the X-axis of
Figure 1. Then we use different INDs, i.e., the GR-IND as in Eq. (7),
the TD-IND as in Eq. (10), the H-IND as in Eq. (12), and the Nq-IND
as in Eq. (13) (with q = 2), to measure the distance between
$\tilde{a}$ and the interval number models of the different focal
elements. The distances are listed in Table II.








TABLE II
THE INDS BETWEEN $\tilde{a}$ AND THE FOCAL ELEMENTS' INTERVAL NUMBER
MODELS.

Focal elements           | GR-IND | TD-IND | H-IND  | Nq-IND
$\{\theta_1\}$           | 0.9296 | 1.0000 | 2.0000 | 2.2361
$\{\theta_2\}$           | 1.0315 | 3.2146 | 5.0000 | 5.0990
$\{\theta_3\}$           | 1.5474 | 4.5826 | 6.0000 | 6.7082
$\{\theta_1, \theta_2\}$ | 1.1464 | 1.5275 | 2.0000 | 2.2361
$\{\theta_2, \theta_3\}$ | 1.5745 | 4.0415 | 5.0000 | 5.8310

















Then, using the distances, the similarity degrees are calculated
according to Eq. (6), where the support coefficient is set to
$\alpha = 5$. By normalizing the similarity degrees, the BBAs are
obtained, as listed in Table III.

As shown in Table III, the basic beliefs assigned to the different
focal elements differ less under GR-IND than under TD-IND, H-IND and
Nq-IND. For example, using GR-IND the basic beliefs assigned to
$\{\theta_1\}$ and $\{\theta_2\}$ are 0.2552 and 0.2305, a difference
of only 0.0247. Using TD-IND, the basic beliefs of $\{\theta_1\}$ and
$\{\theta_2\}$ are 0.4086 and 0.1289, a much larger difference of
0.2797. The BBAs determined based on H-IND and Nq-IND are similar to
each other.

TABLE III
THE BBAS DETERMINED BASED ON DIFFERENT INDS.

Focal elements           | GR-IND | TD-IND | H-IND  | Nq-IND
$\{\theta_1\}$           | 0.2552 | 0.4086 | 0.3184 | 0.3163
$\{\theta_2\}$           | 0.2305 | 0.1289 | 0.1281 | 0.1394
$\{\theta_3\}$           | 0.1546 | 0.0906 | 0.1069 | 0.1061
$\{\theta_1, \theta_2\}$ | 0.2078 | 0.2693 | 0.3184 | 0.3162
$\{\theta_2, \theta_3\}$ | 0.1519 | 0.1026 | 0.1282 | 0.1220

Here, we use the Pignistic probability transformation (as in 
Eq. (5)) for transforming the BBAs to probabilities for decision 
making. The probabilities of the testing sample belonging to 
different classes are listed in Table IV. 


TABLE IV
THE PIGNISTIC PROBABILITIES OBTAINED BASED ON DIFFERENT INDS.

Classes                | GR-IND | TD-IND | H-IND  | Nq-IND
Class 1 ($\theta_1$)   | 0.3591 | 0.5433 | 0.4777 | 0.4744
Class 2 ($\theta_2$)   | 0.4103 | 0.3148 | 0.3514 | 0.3585
Class 3 ($\theta_3$)   | 0.2306 | 0.1419 | 0.1709 | 0.1671

















Intuitively, the testing sample more likely belongs to class 1, as
shown in Fig. 1. According to Table IV, the methods using the TD-IND,
H-IND and Nq-IND all make the right classification. According to the
probabilities obtained from the GR-IND, the testing sample would be
classified into class 2. Revisiting the BBA determined based on
GR-IND, the basic beliefs assigned to the single-class focal elements
have the right trend, i.e.,
$m(\{\theta_1\}) > m(\{\theta_2\}) > m(\{\theta_3\})$. However, the
Pignistic probabilities obtained from the GR-IND are
counter-intuitive once the beliefs assigned to the focal elements
with multiple classes are counted in. From this perspective, the BBA
determined based on GR-IND is not satisfactory. In this numerical
example, the interval number based methods using the TD-IND, H-IND
and Nq-IND are more suitable for BBA determination than the one using
the GR-IND if the decision is based on the maximum of BetP.
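The reversal under GR-IND can be traced by applying Eq. (5) to the GR-IND column of Table III. The worked arithmetic below uses only the tabulated masses, with results rounded to three decimals:

```latex
\begin{align*}
BetP(\theta_1) &= m(\{\theta_1\}) + \tfrac{1}{2}\, m(\{\theta_1,\theta_2\})
               = 0.2552 + \tfrac{0.2078}{2} \approx 0.359 \\
BetP(\theta_2) &= m(\{\theta_2\}) + \tfrac{1}{2}\, m(\{\theta_1,\theta_2\})
               + \tfrac{1}{2}\, m(\{\theta_2,\theta_3\})
               = 0.2305 + \tfrac{0.2078}{2} + \tfrac{0.1519}{2} \approx 0.410 \\
BetP(\theta_3) &= m(\{\theta_3\}) + \tfrac{1}{2}\, m(\{\theta_2,\theta_3\})
               = 0.1546 + \tfrac{0.1519}{2} \approx 0.231
\end{align*}
```

Class 2 belongs to both compound focal elements and therefore collects half of each of their masses, whereas class 1 shares only $\{\theta_1, \theta_2\}$. Since the singleton masses under GR-IND differ by only 0.0247, this extra share is enough to place class 2 ahead of class 1.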


V. EXPERIMENT 


To compare the interval number based BBA determination method using
different INDs, we use Monte-Carlo experiments on the classification
of the artificial set and the Iris set. The information fusion based
classification is implemented as follows. In each classification, the
interval number based method is used for determining the BBA on each
single feature. Then these multiple BBAs are combined using
Dempster's combination rule as in Eq. (4). The combined BBA is then
transformed into probabilities using the Pignistic probability
transformation as in Eq. (5). The testing sample is assigned to the
class which has the largest Pignistic probability.

In the experiment, the interval number based methods using different
INDs are each used for determining the BBAs. In the Nq-IND, we take
q = 2. The parameter $\alpha$ used to generate the similarity degrees
in the interval number based BBA determination method (as in Eq. (6))
is set to 5. The Monte-Carlo classification experiments are repeated
100 times with random testing samples. The effectiveness of the
interval number based BBA determination methods using different INDs
is compared using the average accuracy over the 100 runs.


A. Experiment on artificial set 


The generated artificial set contains 3 classes. Each class has 50
samples, and each sample has 3 features. The features of the
different classes are generated according to Gaussian distributions,
i.e., $G(\mu, \sigma^2)$. The standard deviations ($\sigma$) of all
features of all classes are set to $\sigma = 1$. The mean ($\mu$)
settings of the different classes' features are listed in Table V.


TABLE V
THE MEAN ($\mu$) SETTINGS OF DIFFERENT CLASSES' DIFFERENT FEATURES.

Classes              | Feature 1 | Feature 2 | Feature 3
Class 1 ($\theta_1$) | 8         | 5         | 10
Class 2 ($\theta_2$) | 10        | 9         | 6
Class 3 ($\theta_3$) | 5         | 11        | 9
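The construction of the artificial set, together with the random split into 25 training and 25 testing samples per class used in each Monte-Carlo run of this subsection, can be sketched as follows. The means are those of Table V with unit standard deviation; the function names, the class labels `t1` to `t3` and the seed are hypothetical choices made only so that the sketch is repeatable.

```python
import random
from typing import Dict, List, Tuple

Sample = List[float]
DataSet = Dict[str, List[Sample]]

# Means from Table V: one list of three feature means per class.
MEANS = {"t1": [8.0, 5.0, 10.0], "t2": [10.0, 9.0, 6.0], "t3": [5.0, 11.0, 9.0]}
SIGMA = 1.0  # standard deviation of every feature of every class

def make_artificial_set(rng: random.Random, n_per_class: int = 50) -> DataSet:
    """Draw n_per_class Gaussian samples with 3 features for each class."""
    return {
        label: [[rng.gauss(mu, SIGMA) for mu in mus] for _ in range(n_per_class)]
        for label, mus in MEANS.items()
    }

def split(data: DataSet, rng: random.Random,
          n_train: int = 25) -> Tuple[DataSet, DataSet]:
    """Randomly pick n_train samples per class for training; rest for testing."""
    train: DataSet = {}
    test: DataSet = {}
    for label, samples in data.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        train[label] = shuffled[:n_train]
        test[label] = shuffled[n_train:]
    return train, test

rng = random.Random(0)  # arbitrary seed, for repeatability only
data = make_artificial_set(rng)
train, test = split(data, rng)
print({label: (len(train[label]), len(test[label])) for label in data})
```

A full Monte-Carlo experiment would repeat the split 100 times and average the classification accuracy, as described in this section.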














The features of different classes in the artificial set we 
generated are shown in Figures 2-4. 








[Figure: Feature 1; feature values (axis from 2 to 14) of the samples of Classes 1-3]

Fig. 2. Artificial samples' feature 1 of different classes.

[Figure: Feature 2; feature values (axis from 2 to 14) of the samples of Classes 1-3]

Fig. 3. Artificial samples' feature 2 of different classes.


As shown in Figures 2-4, in feature 1, class 3 is linearly separable
from class 1 and class 2, while class 1 and class 2 are not linearly
separable from each other. Similarly, class 2 and class 3 are not
linearly separable from each other in feature 2, and class 1 and
class 3 are not linearly separable from each other in feature 3.








[Figure: Feature 3; feature values (axis from 2 to 14) of the samples of Classes 1-3]

Fig. 4. Artificial samples' feature 3 of different classes.


In each Monte-Carlo run, we randomly select 25 samples from each
class (75 samples in total) as the set of training samples, and the
remaining samples are used as the testing samples. We first classify
each testing sample according to the BBA determined on each single
feature. Then, we combine the BBAs determined on the 3 features, and
use the combined BBA for classifying the testing sample. The results
of the methods based on different INDs are listed in Table VI.











TABLE VI
THE RESULTS OF THE METHODS BASED ON DIFFERENT INDS.
Classification correct rate (%)

INDs   | Feature 1 | Feature 2 | Feature 3 | Combined
GR-IND | 44.70     | 64.86     | 42.62     | 80.95
TD-IND | 67.71     | 84.13     | 61.66     | 94.84
H-IND  | 64.66     | 80.24     | 56.01     | 89.66
Nq-IND | 65.86     | 81.68     | 55.84     | 91.97

















In Table VI, the columns "Feature 1", "Feature 2" and "Feature 3"
give the results of the methods using different INDs based on each
single feature. The column "Combined" gives the results obtained by
combining the BBAs determined on the different features with
Dempster's rule of combination. According to Table VI, the
classification based on any single feature does not perform well,
whichever IND is used. However, the BBAs determined on different
features reflect different aspects of the information carried by the
samples. By fusing the BBAs based on different features, better
classification performance is obtained. Comparing the results of the
methods based on different INDs, the method based on GR-IND performs
the worst. The performances of the methods based on TD-IND, H-IND and
Nq-IND are similar, with the one based on TD-IND being the best. The
GR-IND is therefore not recommended for BBA determination.


B. Experiment on iris set 


The Iris set contains 3 classes. Each class has 50 samples, and each
sample has 4 features. In this experiment, we randomly select
different numbers of samples as the training samples (the same number
of samples is selected from each class), and all the samples are used
as the testing samples. The results of the interval number based BBA
determination methods based on different INDs are shown in Figure 5.

Fig. 5. Performances of the interval number based methods using different
INDs with different scales of training samples on iris data set.


According to Figure 5, the methods using TD-IND, H-IND and Nq-IND
perform well with both small and large numbers of training samples.
The method using TD-IND performs the best among the four INDs. The
results of the method using GR-IND are counter-intuitive, since its
accuracy decreases as the number of training samples increases. When
the number of training samples is large, the generated interval
numbers can better model the features of the corresponding classes,
especially for the multiple-class focal elements (i.e., the
overlapping ranges of the corresponding classes' interval number
models). However, as discussed in the numerical example in Section
IV-B, the interval number based method using GR-IND is not
recommended for determining the BBA, especially when the beliefs of
the multiple-class focal elements are counted in. That is why the
method using GR-IND performs poorly when the number of training
samples is large.


VI. CONCLUSION 


In this paper, we have tested different INDs for implementing the
interval number based BBA determination method. The effectiveness of
the BBAs is compared on information fusion based classification
problems. The experiments validate that combining the BBAs determined
using interval number based methods with different INDs performs well
for the classification problems. The methods using the TD-IND, H-IND
and Nq-IND provide nearly similar performances, with the one using
TD-IND being the best. Using the GR-IND, the construction of basic
beliefs is not very effective. With GR-IND, the differences between
the basic beliefs assigned to the different focal elements are small,
which is not discriminant enough for making decisions, especially
when the multiple-class focal elements are counted in. Therefore, the
method using the GR-IND is not recommended.

Up to now, the interval number based BBA determination methods have
been implemented on single features. In future work, we will try to
use interval numbers for determining the BBAs on multiple-feature
spaces, and compare the effectiveness of the methods using different
INDs. We will also explore different decision-making strategies
(e.g., DSmP, min of d_BI) and test other rules of combination to see
whether classification performance can be improved.